Self-supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation

نویسندگان

  • Gregory Kahn
  • Adam Villaflor
  • Bosen Ding
  • Pieter Abbeel
  • Sergey Levine
چکیده

Enabling robots to autonomously navigate complex environments is essential for real-world deployment. Prior methods approach this problem by having the robot maintain an internal map of the world, and then use a localization and planning method to navigate through the internal map. However, these approaches often include a variety of assumptions, are computationally intensive, and do not learn from failures. In contrast, learning-based methods improve as the robot acts in the environment, but are difficult to deploy in the real-world due to their high sample complexity. To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between model-free and model-based. We then instantiate this graph to form a navigation model that learns from raw images and is sample efficient. Our simulated car experiments explore the design decisions of our navigation model, and show our approach outperforms single-step and N -step double Q-learning. We also evaluate our approach on a real-world RC car and show it can learn to navigate through a complex indoor environment with a few hours of fully autonomous, self-supervised training. Videos of the experiments and code can be found at github.com/gkahn13/gcg

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)

In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic Environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...

متن کامل

Learning to Navigate by Growing Deep Networks

Adaptability is central to autonomy. Intuitively, for high-dimensional learning problems such as navigating based on vision, internal models with higher complexity allow to accurately encode the information available. However, most learning methods rely on models with a fixed structure and complexity. In this paper, we present a self-supervised framework for robots to learn to navigate, without...

متن کامل

Mobile Robot Online Motion Planning Using Generalized Voronoi Graphs

In this paper, a new online robot motion planner is developed for systematically exploring unknown environ¬ments by intelligent mobile robots in real-time applications. The algorithm takes advantage of sensory data to find an obstacle-free start-to-goal path. It does so by online calculation of the Generalized Voronoi Graph (GVG) of the free space, and utilizing a combination of depth-first an...

متن کامل

Self-learning navigation algorithm for vision-based mobile robots using machine learning algorithms

Many mobile robot navigation methods use, among others, laser scanners, ultrasonic sensors, vision cameras for detecting obstacles and following paths. However, humans use only visual (e.g. eye) information for navigation. In this paper, we propose a mobile robot control method based on machine learning algorithms which use only camera vision. To efficiently define the state of the robot from r...

متن کامل

PQ−Learning: An Efficient Robot Learning Method for Intelligent Behavior Acquisition

This paper presents an efficient reinforcement learning method, called the PQ-learning, for intelligent behavior acquisition by an autonomous robot. This method uses a special action value propagation technique, named the spatial propagation and temporal propagation, to achieve fast learning convergence in large state spaces. Compared with the approaches in literature, the proposed method offer...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • CoRR

دوره abs/1709.10489  شماره 

صفحات  -

تاریخ انتشار 2017